Title of Thesis: Learning Structured Classifiers for Statistical Dependency Parsing

Authors

  • Qin Iris Wang
  • Dekang Lin
Abstract

In this thesis, I present three supervised approaches and one semi-supervised machine learning approach for improving statistical natural language dependency parsing. I first introduce a generative approach that uses a strictly lexicalised parsing model where all the parameters are based on words, without using any part-of-speech (POS) tags or grammatical categories. Second, I present an improved large margin approach for learning dependency parsers from treebank data that allows a more general set of linguistic features to be used. Specifically, I incorporate local constraints that enforce the correctness of each individual link, rather than just scoring the whole parse tree. To deal with sparse data, I smooth the lexical parameters according to their underlying word similarities using Laplacian regularization. Third, I present a simpler and more efficient approach to training dependency parsers by applying a boosting-like procedure to standard supervised training methods. By using logistic regression as an efficient base classifier (for predicting dependency links between word pairs), I am able to train a dependency parsing model, via structured boosting, that achieves state-of-the-art results in English and surpasses the state of the art in Chinese. Finally, I propose a novel semi-supervised training algorithm for learning dependency parsers. By combining a supervised large margin loss with an unsupervised least squares loss, I obtain a discriminative, convex, semi-supervised training algorithm for dependency parsing.
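The abstract does not spell out the form of the Laplacian regularizer, so the following is only a minimal sketch under stated assumptions: it uses the standard graph-Laplacian quadratic form, in which parameters of similar words are penalized for differing. The word list, similarity values, and weights are hypothetical and chosen purely for illustration; they do not come from the thesis.

```python
# Hypothetical sketch of Laplacian smoothing over word-level parameters.
# Assumption: the penalty is the standard quadratic form w^T L w with L = D - S,
# where S is a symmetric word-similarity matrix. All values below are made up.

def laplacian(similarity):
    """Build the graph Laplacian L = D - S from a symmetric similarity matrix."""
    n = len(similarity)
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Degree of word i: total similarity to all other words.
        degree = sum(similarity[i][j] for j in range(n) if j != i)
        for j in range(n):
            lap[i][j] = degree if i == j else -similarity[i][j]
    return lap


def laplacian_penalty(weights, similarity):
    """Return w^T L w for one scalar parameter per word."""
    lap = laplacian(similarity)
    n = len(weights)
    return sum(weights[i] * lap[i][j] * weights[j]
               for i in range(n) for j in range(n))


def pairwise_penalty(weights, similarity):
    """Equivalent pairwise form: 0.5 * sum_ij S_ij * (w_i - w_j)^2."""
    n = len(weights)
    return 0.5 * sum(similarity[i][j] * (weights[i] - weights[j]) ** 2
                     for i in range(n) for j in range(n))


# Illustrative data: "eat" and "devour" are assumed similar, "table" is not.
words = ["eat", "devour", "table"]
similarity = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.05],
    [0.1, 0.05, 0.0],
]
weights = [1.0, 0.2, -0.5]

penalty = laplacian_penalty(weights, similarity)
```

Adding a term like `penalty`, scaled by a regularization constant, to a training objective pulls the parameters of similar words toward each other, which is one way sparse lexical statistics can borrow strength from related words. Whether the thesis applies the penalty per feature, per word pair, or in some other arrangement is not stated in the abstract.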


Similar resources

Learning Structured Classifiers for Statistical Dependency Parsing

My research is focused on developing machine learning algorithms for inferring dependency parsers from language data. By investigating several approaches I have developed a unifying perspective that allows me to share advances between both probabilistic and non-probabilistic methods. First, I describe a generative technique that uses a strictly lexicalised parsing model, where all the parameter...


Second Exam: Natural Language Parsing with Neural Networks

With the advent of “deep learning”, there has been a recent resurgence of interest in the use of artificial neural networks for machine learning. This paper presents an overview of recent research in the statistical parsing of natural language sentences using such neural networks as a learning model. Though it is a fairly new addition to the toolset in this area, important results have been rec...


Advances in discriminative dependency parsing

Achieving a greater understanding of natural language syntax and parsing is a critical step in producing useful natural language processing systems. In this thesis, we focus on the formalism of dependency grammar as it allows one to model important head-modifier relationships with a minimum of extraneous structure. Recent research in dependency parsing has highlighted the discriminative structur...


Logistic Online Learning Methods and Their Application to Incremental Dependency Parsing

We investigate a family of update methods for online machine learning algorithms for cost-sensitive multiclass and structured classification problems. The update rules are based on multinomial logistic models. The most interesting question for such an approach is how to integrate the cost function into the learning paradigm. We propose a number of solutions to this problem. To demonstrate the a...


IRWIN AND JOAN JACOBS CENTER FOR COMMUNICATION AND INFORMATION TECHNOLOGIES Confidence Estimation in Structured Prediction

Structured classification tasks such as sequence labeling and dependency parsing have seen much interest from the Natural Language Processing and machine learning communities. Several online learning algorithms have been adapted for structured tasks, such as Perceptron, Passive-Aggressive and the recently introduced Confidence-Weighted learning. These online algorithms are easy to implement, fast t...


Publication date: 2007